SeqIndex: Indexing Sequences by Sequential Pattern Analysis

Authors

  • Hong Cheng
  • Xifeng Yan
  • Jiawei Han
Abstract

In this paper, we study the issues related to the design and construction of high-performance sequence index structures in large sequence databases. To build effective indices, a novel method, called SeqIndex, is proposed, in which the selection of indices is based on the analysis of discriminative, frequent sequential patterns mined from large sequence databases. Such an analysis leads to the construction of compact and effective indexing structures. Furthermore, we eliminate the requirement of setting an optimal support threshold beforehand, which is difficult for users to provide in practice. The discriminative, frequent pattern based indexing method is proven very effective based on our performance study.
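As a rough illustration of the general idea of pattern-based sequence indexing, the following sketch builds an inverted index from frequent subsequences to the ids of the sequences that contain them, then answers a query by intersecting posting lists and verifying the surviving candidates. It is a toy under stated assumptions, not the SeqIndex algorithm itself: the function names, the fixed `min_support`, the `max_len` cap, and the brute-force pattern enumeration are all hypothetical, and the discriminative-pattern selection and automatic threshold handling described in the abstract are not modeled.

```python
from collections import defaultdict
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def build_index(database, min_support, max_len=3):
    """Map each frequent subsequence (up to `max_len` items) to the set of
    ids of the sequences that contain it. Brute-force enumeration; a real
    system would use a sequential pattern miner instead (assumption)."""
    postings = defaultdict(set)
    for seq_id, seq in enumerate(database):
        for n in range(1, max_len + 1):
            for pattern in set(combinations(seq, n)):
                postings[pattern].add(seq_id)
    # Keep only patterns that meet the (hypothetical) support threshold.
    return {p: ids for p, ids in postings.items() if len(ids) >= min_support}


def search(database, index, query, max_len=3):
    """Return ids of sequences that contain `query` as a subsequence."""
    candidates = set(range(len(database)))
    # Filtering: any sequence containing the query must contain every
    # indexed sub-pattern of the query, so intersecting is safe.
    for n in range(1, min(max_len, len(query)) + 1):
        for pattern in set(combinations(query, n)):
            if pattern in index:
                candidates &= index[pattern]
    # Verification: check the remaining candidates against the full query.
    return sorted(i for i in candidates if is_subsequence(query, database[i]))


if __name__ == "__main__":
    db = ["abcd", "acbd", "abd", "bcda"]
    idx = build_index(db, min_support=2)
    print(search(db, idx, "abd"))
```

Patterns below the support threshold are simply absent from the index, so they contribute no filtering; the verification step keeps the answer exact regardless of which patterns were indexed.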


Similar resources

Fast Discovery of Sequential Patterns through Memory Indexing and Database Partitioning

Sequential pattern mining is a challenging issue because of the high complexity of temporal pattern discovering from numerous sequences. Current mining approaches either require frequent database scanning or the generation of several intermediate databases. As databases may fit into the ever-increasing main memory, efficient memory-based discovery of sequential patterns is becoming possible. In...


Discovery of Sequential Patterns with Quantity Factors

The sequential pattern mining stems from the need to obtain patterns that are repeated in multiple transactions in a database of sequences, which are related to time, or another type of criterion. This work presents the proposal of a new technique for the discovery of sequential patterns from a database of sequences, where the patterns not only provide information on how these relate to the tim...


A Fuzzy Approach to Sequential Failure Analysis Using Petri nets

In highly competitive industrial market, the concept of failure analysis is an unavoidable fact in complex industrial systems. Reliability of such systems not only depends on the reliability of each element of these systems, but also depends on occurrence of sequence of failures. In this paper, a novel approach to sequential failure analysis is proposed which is based upon fuzzy logic and the c...


Statistical sequential analysis for real-time video scene change detection on compressed multimedia bitstream

The increased availability and usage of multimedia information have created a critical need for efficient multimedia processing algorithms. These algorithms must offer capabilities related to browsing, indexing, and retrieval of relevant data. A crucial step in multimedia processing is that of reliable video segmentation into visually coherent video shots through scene change detection. Video s...


Logs ?

Web access logs, usually stored in relational databases, are commonly used for various data mining and data analysis tasks. The tasks typically consist in searching the web access logs for event sequences that support a given sequential pattern. For large data volumes, this type of searching is extremely time consuming and is not well optimized by traditional indexing techniques. In this paper ...



Publication date: 2005